Search in the Catalogues and Directories

Hits 1 – 20 of 70

1
Automatic Dialect Density Estimation for African American English ...
BASE
2
DIALKI: Knowledge Identification in Conversational Systems through Dialogue-Document Contextualization ...
BASE
3
Dialogue State Tracking with a Language Model using Schema-Driven Prompting ...
BASE
4
A Controllable Model of Grounded Response Generation ...
BASE
5
Neural Models for Integrating Prosody in Spoken Language Understanding
Tran, Trang. - 2020
BASE
6
Automatic Analysis of Language Use in K-16 STEM Education and Impact on Student Performance
Nadeem, Farah. - 2020
BASE
7
Asynchronous Speech Recognition Affects Physician Editing of Notes
Lybarger, Kevin J.; Ostendorf, Mari; Riskin, Eve. - : Georg Thieme Verlag KG, 2018
BASE
8
Low-Rank RNN Adaptation for Context-Aware Language Modeling
Jaech, Aaron. - 2018
BASE
9
Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information ...
BASE
10
Effective Use of Cross-Domain Parsing in Automatic Speech Recognition and Error Detection
Marin, Marius. - 2015
BASE
11
Automatic Characterization of Text Difficulty
Medero, Julie. - 2014
BASE
12
Data Selection for Statistical Machine Translation
Abstract: Thesis (Ph.D.)--University of Washington, 2014 ; Machine translation, the computerized translation of one human language to another, could be used to communicate between the thousands of languages used around the world. Statistical machine translation (SMT) is an approach to building these translation engines without much human intervention, and large-scale implementations by Google, Microsoft, and Facebook in their products are used by millions daily. The quality of SMT systems depends on the example translations used to train the models. Data can come from a variety of sources, many of which are not optimal for common specific tasks. The goal is to be able to find the right data to use to train a model for a particular task. This work determines the most relevant subsets of these large datasets with respect to a translation task, enabling the construction of task-specific translation systems that are more accurate and easier to train than the large-scale models. Three methods are explored for identifying task-relevant translation training data from a general data pool. The first uses only a language model to score the training data according to lexical probabilities, improving on prior results by using a bilingual score that accounts for differences between the target domain and the general data. The second is a topic-based relevance score that is novel for SMT, using topic models to project texts into a latent semantic space. These semantic vectors are then used to compute similarity of sentences in the general pool to the target task. This work finds that what the automatic topic models capture for some tasks is actually the style of the language, rather than task-specific content words. This motivates the third approach, a novel style-based data selection method. Hybrid word and part-of-speech (POS) representations of the two corpora are constructed by retaining the discriminative words and using POS tags as a proxy for the stylistic content of the infrequent words. Language models based on these representations can be used to quantify the underlying stylistic relevance between two texts. Experiments show that style-based data selection can outperform the current state-of-the-art method for task-specific data selection, in terms of SMT system performance and vocabulary coverage. Taken together, the experimental results indicate that it is important to characterize corpus differences when selecting data for statistical machine translation.
Keyword: Computer science; data selection; electrical engineering; language modeling; machine translation; natural language processing; topic modeling
URL: http://hdl.handle.net/1773/26146
BASE
13
Graph-based query strategies for active learning
In: Institute of Electrical and Electronics Engineers. IEEE transactions on audio, speech and language processing. - New York, NY : Inst. 21 (2013) 2, 260-269
OLC Linguistik
14
Rank and Sparsity in Language Processing
BASE
15
Joint reranking of parsing and word recognition with automatic segmentation
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 26 (2012) 1, 1-19
BLLDB
OLC Linguistik
16
Graph-based Algorithms for Lexical Semantics and its Applications
Wu, Wei. - 2012
BASE
17
Expected dependency pair match: predicting translation quality with expected syntactic structure
In: Machine translation. - Dordrecht [u.a.] : Springer Science + Business Media 23 (2010) 2-3, 169-179
BLLDB
OLC Linguistik
18
A machine learning approach to reading level assessment
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 23 (2009) 1, 89-106
OLC Linguistik
19
A machine learning approach to reading level assessment
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 23 (2009) 1, 89-106
BLLDB
OLC Linguistik
20
Improving robustness of MLLR adaptation with speaker-clustered regression class trees
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 23 (2009) 2, 176-199
BLLDB
OLC Linguistik
